On Combining Wavelets Expansion and Sparse Linear Models for Regression on Metabolomic Data and Biomarker Selection

Authors

Nathalie Villa-Vialaneix

Noslén Hernández-González

Alain Paris

Céline Domange

Nathalie Priymenko

Philippe Besse

Abstract

Citation: Villa Vialaneix, N., Hernandez, N., Paris, A., Domange, C., Priymenko, N., Besse, P. (2016). On combining wavelets expansion and sparse linear models for regression on metabolomic data and biomarker selection. Communications in Statistics - Simulation and Computation, 45(1), 282-298. DOI: 10.1080/03610918.2013.862273

Wavelet thresholding of spectra has to be handled with care when the spectra are the predictors of a regression problem. Indeed, a blind thresholding of the signal followed by a regression method often leads to deteriorated predictions. The scope of this paper is to show that sparse regression methods, applied in the wavelet domain, perform an automatic thresholding: the most relevant wavelet coefficients are selected to optimize the prediction of a given target of interest. This approach can be seen as a joint thresholding designed for a predictive purpose. The method is illustrated on a real-world problem where metabolomic data is linked to poison ingestion. This example demonstrates the usefulness of wavelet expansion and the good behavior of sparse and regularized methods. A comparison study is performed between the two-step approach (wavelet thresholding followed by regression) and the one-step approach (selection of wavelet coefficients with a sparse regression). The comparison includes two types of wavelet bases, various thresholding methods and various regression methods, and is evaluated by calculating prediction performances. Information about the location of the most important features on the spectra was also obtained and used to identify the most relevant metabolites involved in the poisoning of the mice.
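The one-step approach described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar basis stands in for the (unnamed) wavelet bases, a plain coordinate-descent lasso stands in for the sparse regression methods, the data are synthetic, and the helper names `haar_dwt` and `lasso_cd` as well as the penalty value `lam=0.1` are hypothetical.

```python
import numpy as np


def haar_dwt(signal):
    """Full orthonormal Haar decomposition (length must be a power of two).

    Returns the coarsest approximation coefficient followed by the detail
    coefficients, ordered from coarse to fine scales.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return np.concatenate([approx] + details[::-1])


def soft_threshold(z, t):
    """Shrink z toward zero by t, setting small values exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(W, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - W b||^2 + lam * ||b||_1, no intercept."""
    n, p = W.shape
    beta = np.zeros(p)
    col_sq = (W ** 2).sum(axis=0) / n
    resid = np.array(y, dtype=float)  # residual y - W @ beta, with beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = W[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += W[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


# Synthetic "spectra" (assumption): noise plus one localized peak whose
# height drives the target, mimicking a single informative spectral region.
rng = np.random.default_rng(0)
n_samples, n_points = 40, 64
amplitude = rng.normal(size=n_samples)
bump = np.zeros(n_points)
bump[16:24] = 1.0
spectra = 0.1 * rng.normal(size=(n_samples, n_points)) + np.outer(amplitude, bump)
target = amplitude + 0.05 * rng.normal(size=n_samples)

# One-step approach: expand every spectrum on the wavelet basis, then let the
# sparse regression decide which coefficients to keep for predicting the target.
W = np.array([haar_dwt(s) for s in spectra])
W = W - W.mean(axis=0)
y = target - target.mean()
beta = lasso_cd(W, y, lam=0.1)
selected = np.flatnonzero(beta)
print("selected wavelet coefficients:", selected)
```

In this sketch the L1 penalty plays the role of the threshold: coefficients that do not help predict the target are set exactly to zero, so no separate denoising step is applied to the spectra beforehand. The indices in `selected` point to scale and position pairs in the wavelet domain, which is the kind of localization information the abstract says was used to identify relevant metabolites.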

Similar resources

Robust Estimation in Linear Regression with Molticollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

CAS WAVELET METHOD FOR THE NUMERICAL SOLUTION OF BOUNDARY INTEGRAL EQUATIONS WITH LOGARITHMIC SINGULAR KERNELS

In this paper, we present a computational method for solving boundary integral equations with logarithmic singular kernels which occur as reformulations of a boundary value problem for the Laplacian equation. The method is based on the use of the Galerkin method with CAS wavelets constructed on the unit interval as basis. This approach utilizes the non-uniform Gauss-Legendre quadrature rule for ...

Relevance vector machine and multivariate adaptive regression spline for modelling ultimate capacity of pile foundation

This study examines the capability of the Relevance Vector Machine (RVM) and Multivariate Adaptive Regression Spline (MARS) for prediction of ultimate capacity of driven piles and drilled shafts. RVM is a sparse method for training generalized linear models, while MARS technique is basically an adaptive piece-wise regression approach. In this paper, pile capacity prediction models are developed...

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...

Journal:

Communications in Statistics - Simulation and Computation

Volume 45, Issue 1

Pages 282-298

Publication date: 2016
